Abstract

The Travelling Salesman Problem is one of the most fundamental and most studied
problems in approximation algorithms. For more than 30 years, the best algorithm known
for general metrics has been Christofides's algorithm with approximation factor of $\frac{3}{2}$,
even though the so-called Held-Karp LP relaxation of the problem is conjectured to have
an integrality gap of only $\frac{4}{3}$. Very recently, significant progress has been made for the
important special case of graphic metrics, first by Oveis Gharan et al. [3], and then by
Mömke and Svensson [8]. In this paper, we provide an improved analysis of the approach
presented in [8], yielding a bound of $\frac{13}{9}$ on the approximation factor, as well as a bound
of $\frac{19}{12} + \varepsilon$ for any $\varepsilon > 0$ for the more general Travelling Salesman Path Problem in graphic
metrics.

Subject Classification: approximation algorithms, travelling salesman problem 

1 Introduction and related work 

The Travelling Salesman Problem (TSP) is one of the most fundamental and most studied
problems in combinatorial optimization, and in approximation algorithms in particular. In the
most standard version of the problem, we are given a metric $(V, d)$ and the goal is to find a
closed tour that visits each point of $V$ exactly once and has minimum total cost, as measured
by $d$. This problem is APX-hard, and the best known approximation factor of $\frac{3}{2}$ was obtained
by Christofides [1] more than thirty years ago. However, the so-called Held-Karp LP relaxation
of TSP is conjectured to have an integrality gap of $\frac{4}{3}$. It is known to have a gap at least that
big; however, the best known upper bound [9] for the gap is given by Christofides's algorithm
and is equal to $\frac{3}{2}$.

In a more general version of the problem, called the Travelling Salesman Path Problem
(TSPP), in addition to a metric $(V, d)$ we are also given two points $s, t \in V$ and the goal is to
find a path from $s$ to $t$ visiting each point exactly once, except if $s$ and $t$ are the same point,
in which case it can be visited twice (this is when TSPP reduces to TSP). For this problem,
the best approximation algorithm known is that of Hoogeveen [7] with approximation factor
of $\frac{5}{3}$. However, the Held-Karp relaxation of TSPP is conjectured to have an integrality gap
of $\frac{3}{2}$.



*Note that this is a second version of this paper and it has been updated with new results, starting from
Section 4.
One of the natural directions of attacking these problems is to consider special cases, and
several attempts of this nature have been made. The most interesting one is by far the graphic
TSP/TSPP, where we assume that the given metric is the shortest path metric of an
undirected graph. Equivalently, in graphic TSP we are given an undirected graph $G = (V, E)$ and
we need to find a shortest tour that visits each vertex at least once. Yet another formulation
would ask for a minimum size Eulerian multigraph spanning $V$ and only using edges of $G$.
Similar formulations apply to the graphic TSPP case. The reason why these special cases
are very interesting is that they seem to include the difficult inputs of TSP/TSPP. Not only
are they APX-hard (see [5]), but also the standard examples showing that the Held-Karp
relaxation has a gap of at least $\frac{4}{3}$ in the TSP case and $\frac{3}{2}$ in the TSPP case are in fact graphic
metrics.

Very recently, significant progress has been made in approximating the graphic TSP and
TSPP. First, Oveis Gharan et al. [3] gave an algorithm with an approximation factor $\frac{3}{2} - \varepsilon$ for
graphic TSP. Despite $\varepsilon$ being of the order of $10^{-12}$, this is considered a major breakthrough.
Following that, Mömke and Svensson [8] obtained a significantly better approximation factor
of $\frac{14(\sqrt{2}-1)}{12\sqrt{2}-13} < 1.461$ for graphic TSP, as well as factor $3 - \sqrt{2} + \varepsilon < 1.586 + \varepsilon$ for graphic TSPP,
for any $\varepsilon > 0$. Their approach uses matchings in a truly ingenious way. Whereas most earlier
approaches (including that of Christofides [1] as well as Oveis Gharan et al. [3]) add edges of
a matching to a spanning tree to make it Eulerian, the new approach is based on adding and
removing the matching edges. This process is guided by a so-called removable pairing of edges
which essentially encodes the information on which edges can be simultaneously removed from
the graph without disconnecting it. A large removable pairing of edges is found by computing
a minimum cost circulation in a certain auxiliary flow network, and the bounds on the cost
of this circulation translate into bounds on the size of the resulting TSP tour/path.

1.1 Our results 

In this paper we present an improved analysis of the cost of the circulation used in the
construction of the TSP tour/path. Our results imply a bound of $\frac{13}{9} \approx 1.444$ on the
approximation factor for the graphic TSP, as well as a $\frac{19}{12} + \varepsilon \approx 1.583 + \varepsilon$ bound for the graphic
TSPP, for any $\varepsilon > 0$. The circulation used in [8] consists of two parts: the "core" part based
on an extreme optimal solution to the Held-Karp relaxation of TSP, and the "correction"
part that adds enough flow to the core part to make it feasible. We improve the bounds on the costs
of both parts; in particular, we show that the second part is in a sense free. As for the first
part, similarly to the original proof of Mömke and Svensson, our proof exploits its knapsack-like
structure. However, we use the 2-dimensional knapsack problem in our analysis, instead
of the standard knapsack problem. Not only does this lead to an improved bound, it is
also in our opinion a cleaner one. In particular, we also provide a supplementary, essentially
matching lower bound on the cost of the core part, which means that any further progress
on bounding that cost has to take into account more than just the knapsack-like structure of
the circulation.

1.2 Organization of the paper 

In the next section we present previous results relevant to the contributions of this paper; in
particular, we recall key definitions and theorems of Mömke and Svensson [8]. In Section 3
we present the improved upper bound on the cost of the core part of the circulation, as well
as an almost matching lower bound. In Section 4 we prove that the correction part of the
circulation is essentially free. Finally, in Section 5 we apply the results of the previous sections
to obtain improved approximation algorithms for graphic TSP and TSPP.

2 Preliminaries 

In this section we review some standard results concerning TSP/TSPP approximation and
recall the parts of the work of Mömke and Svensson [8] relevant to the contributions of this
paper. Note that large parts of the material presented in [8] are omitted entirely or collapsed
to a single theorem statement. A reader interested in a more detailed and complete exposition
is advised to read the original paper instead.

Held-Karp Relaxation and the Algorithm of Christofides. The Held-Karp relaxation
(or subtour elimination LP) for graphic TSP on a graph $G = (V, E)$ can be formulated as follows
(see [SHHE] for details on equivalence between different formulations): 

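$$\min \sum_{e \in E} x_e \quad \text{subject to} \quad x(\delta(S)) \ge 2 \ \text{ for } \emptyset \ne S \subsetneq V, \quad \text{where } x_e \ge 0.$$

Here $\delta(S)$ denotes the set of all edges between $S$ and $V \setminus S$ for any $S \subseteq V$, and $x(F)$ denotes
$\sum_{e \in F} x_e$ for any $F \subseteq E$.

We will refer to this LP as $LP(G)$ and denote the value of any of its optimal solutions by
$\mathrm{OPT}_{LP}(G)$.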

The approximation ratio of the classic $\frac{3}{2}$-approximation algorithm for metric TSP due to
Christofides [1] is in fact related to $\mathrm{OPT}_{LP}(G)$ as follows:

Theorem 2.1. [Shmoys, Williamson [9]] The cost of the solution produced by the algorithm
of Christofides on a graph $G$ is bounded by $n + \mathrm{OPT}_{LP}(G)/2$, and so its approximation factor
is at most

$$\frac{n + \mathrm{OPT}_{LP}(G)/2}{\mathrm{OPT}_{LP}(G)}.$$

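The Held-Karp relaxation can be generalized to the graphic TSPP in a straightforward
manner. Suppose we want to solve the problem for a graph $G = (V, E)$ and endpoints $s, t$.
Let $\Phi = \{S \subseteq V : |\{s,t\} \cap S| \ne 1\}$. Then the relaxation can be written as

$$\min \sum_{e \in E} x_e \quad \text{subject to} \quad
\begin{array}{ll}
x(\delta(S)) \ge 2 & \text{for } \emptyset \ne S \subsetneq V,\ S \in \Phi, \\
x(\delta(S)) \ge 1 & \text{for } \emptyset \ne S \subsetneq V,\ S \notin \Phi, \\
x_e \ge 0 & \text{for } e \in E.
\end{array}$$

We denote this generalized program by $LP(G, s, t)$ and its optimum value by $\mathrm{OPT}_{LP}(G, s, t)$.
It is clear that $\mathrm{OPT}_{LP}(G, v, v) = \mathrm{OPT}_{LP}(G)$ for any $v \in V$.

Let $G' = (V, E \cup \{e'\})$, where $e' = \{s, t\}$. From any feasible solution to $LP(G, s, t)$ we can
obtain a feasible solution to $LP(G')$ by adding 1 to $x_{e'}$. Therefore

Fact 2.2. $\mathrm{OPT}_{LP}(G,s,t) \ge \mathrm{OPT}_{LP}(G') - 1$.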
Reduction to Minimum Cost Circulation. The authors of [8] use the optimal solution of
$LP(G)$ to construct a low cost circulation in a certain auxiliary flow network. This circulation
is then used to produce a small TSP tour for $G$. We will now describe the construction of
the flow network and the relationship between the cost of the circulation and the size of the
TSP tour.

Let us start with the following reduction:

Lemma 2.3 (Lemma 2.1 and Lemma 2.1 (generalized) of Mömke and Svensson [8]). If there
exists a polynomial time algorithm that for any 2-vertex connected graph $G$ returns a graphic
TSP solution of cost at most $r \cdot \mathrm{OPT}_{LP}(G)$, then there exists an algorithm that does the same
for any connected graph. Similarly, if there exists a polynomial time algorithm that for any
2-vertex connected graph $G$ and its two vertices $s, t$ returns a graphic TSPP solution of cost at
most $r \cdot \mathrm{OPT}_{LP}(G,s,t)$, then there exists an algorithm that does the same for any connected
graph.

We will henceforth assume that the graphs we work with are all 2-vertex-connected. Let
$G$ be such a graph. We now construct a certain auxiliary flow network corresponding to $G$.

Let $T$ be a DFS spanning tree of $G$ with an arbitrary starting vertex $r$. Direct all edges of
$T$ (called tree-edges) away from the root, and all other edges (called back-edges) towards the
root. Let $\vec{G}$ be the resulting directed graph, and let $\vec{T}$ be its subgraph corresponding to $T$.
Where necessary to avoid confusion, we will use the name arcs (and tree-arcs and back-arcs)
for the edges of this directed graph. The flow network is obtained from $\vec{G}$ by replacing some
of its vertices with gadgets.

Let $v$ be any non-root vertex of $\vec{G}$ having $l$ children $w_1, \ldots, w_l$ in $\vec{T}$. We introduce $l$
new vertices $v_1, \ldots, v_l$ and replace the tree-arc $(v, w_j)$ by tree-arcs $(v, v_j)$ and $(v_j, w_j)$ for
$j = 1, \ldots, l$. We also redirect to $v_j$ all the back-arcs leaving the subtree rooted at $w_j$ and
entering $v$. We will call the new vertices and the root in-vertices and the remaining vertices
out-vertices. We will also denote the set of all in-vertices by $\mathcal{I}$. Notice that all the back-arcs
go from out-vertices to in-vertices, and that each in-vertex has exactly one outgoing arc.
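The construction above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used in [8]: the function name `build_network`, the adjacency-dictionary input, and the tuple labels `("in", v, w)` for the new vertices are all hypothetical choices, the graph is assumed simple and 2-vertex-connected (so the root has a single child), and the DFS here follows neighbours in list order rather than by $x^*$ value.

```python
def build_network(adj, root):
    """Orient a DFS tree and insert the in-vertex gadgets described above.

    adj maps each vertex to a list of neighbours (undirected simple graph).
    Returns (tree_arcs, back_arcs, in_vertices).
    """
    parent, depth = {root: None}, {root: 0}
    stack = [(root, iter(adj[root]))]
    while stack:                      # iterative depth-first search
        v, neighbours = stack[-1]
        for w in neighbours:
            if w not in parent:
                parent[w], depth[w] = v, depth[v] + 1
                stack.append((w, iter(adj[w])))
                break
        else:
            stack.pop()

    in_vertices = {root}

    def in_vertex(v, child):
        # The root is its own in-vertex; other vertices get one per child.
        if v == root:
            return root
        label = ("in", v, child)      # hypothetical label for the new vertex
        in_vertices.add(label)
        return label

    tree_arcs, back_arcs = [], []
    for w, v in parent.items():
        if v is None:
            continue
        if v == root:
            tree_arcs.append((v, w))
        else:                         # split (v, w) into (v, v_j) and (v_j, w)
            tree_arcs += [(v, in_vertex(v, w)), (in_vertex(v, w), w)]

    for a in adj:
        for b in adj[a]:
            # Non-tree edge seen from its lower endpoint a; b is an ancestor.
            if depth[a] > depth[b] and parent[a] != b:
                child = a
                while parent[child] != b:   # child of b whose subtree holds a
                    child = parent[child]
                back_arcs.append((a, in_vertex(b, child)))
    return tree_arcs, back_arcs, in_vertices
```

For a 4-cycle 0-1-2-3 with the chord {1, 3} and root 0, the back-arc from 3 to 1 is redirected to the in-vertex that sits between 1 and its child 2, while the back-arc from 3 to the root is left pointing at the root.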

We assign lower bounds (demands) and upper bounds (capacities) as well as costs to arcs.
The demands of the tree-arcs are 1 and the demands of the back-arcs are 0. The capacities of
all arcs are $\infty$. Finally, the cost of any circulation $f$ is defined to be $\sum_{v \in \mathcal{I}} \max(f(B(v)) - 1, 0)$,
where $B(v)$ is the set of incoming back-arcs of $v$. This basically means that the cost is 0 for
tree-arcs and 1 for back-arcs, except that for every in-vertex the first unit of circulation is free.
The circulation network described above will be denoted $C(G,T)$. For any circulation $C$, we
will use $|C|$ to denote its cost as described above.
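As a small illustration of this cost function, the sketch below evaluates it for a given flow. It assumes that $B(v)$ is read as the back-arcs entering $v$ (the reading under which tree-arcs are free), and the function name `circulation_cost` and the dictionary encoding of the flow are hypothetical.

```python
from fractions import Fraction as F

def circulation_cost(back_flow, in_vertices):
    """Sum over in-vertices v of max(f(B(v)) - 1, 0).

    back_flow maps each back-arc (tail, head) to its flow value; tree-arcs
    are not listed because they do not contribute to the cost.
    """
    inflow = {v: F(0) for v in in_vertices}
    for (_, head), value in back_flow.items():
        if head in inflow:
            inflow[head] += value
    # The first unit of back-arc flow into each in-vertex is free.
    return sum((max(total - 1, F(0)) for total in inflow.values()), F(0))
```

With flow 3/2 entering one in-vertex and 1/2 entering another, only the first in-vertex is charged, and the total cost is 1/2.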

It is worth noting that the cost function of $C(G,T)$ can be simulated using the usual
fixed-cost edges by introducing an extra vertex $v'$ for each in-vertex $v$, redirecting all in-arcs
of $v$ to $v'$ and putting two arcs from $v'$ to $v$: one with capacity 1 and cost 0, and the other
with capacity $\infty$ and cost 1. For simplicity of presentation, however, we will use the simpler
network with a slightly unusual cost function.

Also note that the edges of $C(G,T)$ minus the incoming tree-arcs of the in-vertices are
in 1-to-1 correspondence with the edges of $G$. Similarly, all vertices of $C(G,T)$ except for
the new vertices correspond to the vertices of the original graph. We will often use the same
symbol to denote both edges or both vertices.

The main technical tool of [8] is given by the following theorem:

Theorem 2.4 (Lemma 4.1 of [8]). Let $G$ be a 2-vertex connected graph, let $T$ be a DFS tree of
$G$, and let $C^*$ be a circulation in $C(G,T)$ of cost $|C^*|$. Then there exists a spanning Eulerian
multigraph $H$ in $G$ with at most $\frac{4}{3}n + \frac{2}{3}|C^*| - \frac{2}{3}$ edges. In particular, this means that there
exists a TSP tour in the shortest path metric of $G$ with the same cost.

and its generalized version

Theorem 2.5 (Lemma 4.1 (generalized) of [8]). Let $G = (V,E)$ be a 2-vertex connected graph
and $s, t$ its two vertices, and let $G' = (V, E \cup \{e'\})$ where $e' = \{s, t\}$. Let $T$ be a DFS tree of $G'$
and let $C^*$ be a circulation in $C(G',T)$ of cost $|C^*|$. Then there exists a spanning multigraph
$H$ in $G$ that has an Eulerian path between $s$ and $t$ with at most $\frac{4}{3}n + \frac{2}{3}|C^*| - \frac{2}{3} + \frac{\mathrm{dist}_G(s,t)}{3}$
edges. In particular, this means that there exists a TSP path between $s$ and $t$ in the shortest
path metric of $G$ with the same cost.

Remark 2.6. The above theorem is not just a rewording of the generalized version of Lemma
4.1 from [8]. In our version $C^*$ is a circulation in $C(G',T)$ and not $C(G,T)$. Note, however,
that in the proof of Theorem 1.2 of [8] the authors are in fact using the version above, and
provide arguments for why it is correct.

In order to be able to apply Theorem 2.4 and Theorem 2.5, the authors of [8] use the
optimal solution of $LP(G)$ to define a circulation $f$ in $C(G,T)$ as follows. Let $G = (V,E)$
be a graph, and let $E' = \{e \in E : x^*_e > 0\}$, where $x^*$ is an extreme optimal solution of
$LP(G)$. Let $G' = (V,E')$. It is clear that $x^*$ is also an optimal solution for $LP(G')$, so an
$r$-approximate TSP tour with respect to $\mathrm{OPT}_{LP}(G')$ is also $r$-approximate with respect to
$\mathrm{OPT}_{LP}(G)$. Therefore, we can always assume that $E' = E$. The reason why this assumption
is useful is given by the following theorem.

Theorem 2.7 (Cornuéjols, Fonlupt, Naddef [2]). For any graph $G$, the support of any extreme
optimal solution to $LP(G)$ has size at most $2n - 1$.

Thus, we can assume that $|E| \le 2n - 1$. Moreover, we can assume that $G$ is 2-vertex
connected because of Lemma 2.3.

We construct a circulation $f$ in $C(G,T)$ as a sum of two circulations: $f'$ and $f''$. Let $x^*$
be, as before, an extreme optimal solution of $LP(G)$. Also, let the tree $T$ used in the construction of
$C(G,T)$ be the DFS tree resulting from always following the edge $e$ with the highest value of $x^*_e$.
The circulation $f'$ corresponds to sending, for each back-arc $a$, flow of size $\min(x^*_a, 1)$ along
the unique cycle formed by $a$ and some tree-arcs. The circulation $f''$ is defined in a way that
guarantees that $f = f' + f''$ satisfies all the lower bounds. Let $v$ be an out-vertex and $w$ an
in-vertex, such that there is an arc $(v,w)$ in $C(G,T)$, and the flow on $(v,w)$ is smaller than
1. Also let $a$ be any back-arc going from a descendant of $w$ to an ancestor of $v$ (in $T$). Such
an arc always exists since $G$ is 2-vertex connected. We push flow along all edges of the unique
cycle formed by $a$ and tree-arcs until the flow on $(v,w)$ reaches 1.

The total cost of $f$ can be bounded by

$$\sum_{v \in \mathcal{I}} \max(f(B(v)) - 1, 0) \le \sum_{v \in \mathcal{I}} \max(f'(B(v)) - 1, 0) + \sum_{v \in \mathcal{I}} f''(B(v)).$$

We will denote the sum $\sum_{v \in \mathcal{I}} f''(B(v))$ by $|f''|$, which is slightly inconsistent with previous
definitions, but simplifies the notation quite a bit. We thus have $|f| \le |f'| + |f''|$.

The authors of [8] provide the following bounds for the two terms of the above expression:






Lemma 2.8 (Claim 5.3 in [8]). $|f''| \le \mathrm{OPT}_{LP}(G) - n$.

Lemma 2.9 (Claim 5.4 in [8]). $|f'| \le (7 - 6\sqrt{2})n + 4(\sqrt{2} - 1)\,\mathrm{OPT}_{LP}(G)$.

The main theorem of [8] follows from these two bounds:

Theorem 2.10 (Theorem 1.1 in [8]). There exists a polynomial time approximation algorithm
for graphic TSP with approximation ratio $\frac{14(\sqrt{2}-1)}{12\sqrt{2}-13} < 1.461$.

3 New upper bound for $|f'|$

In this section we describe an improved bound on $|f'|$.

Lemma 3.1.

$$|f'| \le \frac{5}{3}\,\mathrm{OPT}_{LP} - \frac{3}{2}n.$$

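Before presenting our analysis of the cost of $f'$ let us recall some notation and basic
observations introduced in [8]. For any $v \in \mathcal{I}$ let $t_v$ be the (unique) outgoing arc of $v$.

Fact 3.2. For every in-vertex $v$, we have

$$|B(v)| \ge \left\lceil \frac{f'(B(v))}{\min(1, x^*_{t_v})} \right\rceil.$$

Proof. Since $T$ was constructed by always following the arc $a$ with the highest value of $x^*_a$,
we have that $x^*_{t_v} \ge x^*_a$ for any $a \in B(v)$. Each back-arc $a \in B(v)$ therefore carries at most
$\min(1, x^*_{t_v})$ units of $f'$, and the claim follows. □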

Decompose $f'(B(v))$ into two parts: $l_v = \min(2 - x^*_{t_v}, f'(B(v)))$ and $u_v = f'(B(v)) - l_v$.
The intuition here is that the higher $u_v$ is, the larger $\mathrm{OPT}_{LP}(G)$ is. In particular, if we let
$u^* = \sum_{v \in \mathcal{I}} u_v$, then

Fact 3.3 (Stated in the proof of Claim 5.4 in [8]).

$$u^* \le 2(\mathrm{OPT}_{LP}(G) - n).$$

Proof. Consider a vertex $v$ of $G$ which (in the construction of $C(G, T)$) is replaced by a gadget
with a set $\mathcal{I}_v$ of in-vertices, and let $x^*(v)$ be the fractional degree of $v$ in $x^*$. Since for any
$w \in \mathcal{I}_v$, the tree-arc $t_w$ and all the back-arcs entering $w$ correspond to edges of $G$ incident
to $v$, each such $w$ contributes at least $2 + u_w$ to $x^*(v)$, provided that $u_w > 0$ (if $u_w = 0$ we
cannot bound $w$'s contribution in any way). Since we also know that $x^*(v) \ge 2$ (this is one
of the inequalities of the Held-Karp relaxation), we get the following bound

$$x^*(v) \ge \max\Bigl(2, \sum_{w \in \mathcal{I}_v:\, u_w > 0} (2 + u_w)\Bigr) \ge 2 + \sum_{w \in \mathcal{I}_v} u_w.$$

Summing this over all vertices we get $2\,\mathrm{OPT}_{LP}(G) \ge 2n + u^*$, and the claim follows. □
Because of Theorem 2.7 and Fact 3.2 we have

$$\sum_{v \in \mathcal{I}} \left\lceil \frac{l_v + u_v}{\min(1, x^*_{t_v})} \right\rceil \le n.$$

Also note that in terms of $l_v$ and $u_v$ the total cost of $f'$ is given by the following formula

$$\sum_{v \in \mathcal{I}} \max(0, l_v + u_v - 1).$$

Our goal is to bound this cost as a function of $n$ and $u^*$. Instead of working directly with $G$
and the solution $x^*$ to the corresponding $LP(G)$, we abstract out the key properties of $x^*_{t_v}$,
$l_v$ and $u_v$ and work in this restricted setting.

Definition. A configuration of size $n$ is a triple $(x, l, u)$, where $x, l, u : \{1, \ldots, n\} \to \mathbb{R}_{\ge 0}$ such
that:

1. $0 < x_i \le 1$,

2. $l_i \le 2 - x_i$, and

3. $u_i > 0 \implies l_i = 2 - x_i$

hold for all $i = 1, \ldots, n$.

Definition. Let $C = (x, l, u)$ be a configuration. We will say that the $i$-th element of $C$ uses
$\left\lceil \frac{l_i + u_i}{x_i} \right\rceil$ edges and denote this number by $e_i(C)$, or $e_i$ if it is clear what $C$ is. We will also say
that $C$ uses $\sum_{i=1}^{n} e_i$ edges.

Also, the value of $C$ is defined as $\mathrm{val}(C) = \sum_{i=1}^{n} \max(0, l_i + u_i - 1)$.
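The two definitions can be made concrete with a short Python sketch. It reads "uses $e_i$ edges" as $e_i = \lceil (l_i + u_i)/x_i \rceil$, which is the form discussed in Remark 3.4; the function names and the list-of-triples encoding are hypothetical, and exact rational arithmetic is used so that the ceilings are not affected by rounding.

```python
from fractions import Fraction as F
from math import ceil

def is_configuration(items):
    """Check properties 1-3 for a list of (x_i, l_i, u_i) triples."""
    for x, l, u in items:
        if not (0 < x <= 1 and l >= 0 and u >= 0):
            return False
        if l > 2 - x:                    # property 2
            return False
        if u > 0 and l != 2 - x:         # property 3
            return False
    return True

def edges_used(x, l, u):
    """e_i: number of edges used by a single element."""
    return ceil((l + u) / x)

def val(items):
    """val(C): sum of max(0, l_i + u_i - 1) over all elements."""
    return sum((max(F(0), l + u - 1) for _, l, u in items), F(0))

# Three example elements; the first two reappear in the proof of Theorem 3.8.
example = [(F(2, 3), F(4, 3), F(0)),
           (F(1, 2), F(3, 2), F(0)),
           (F(1), F(1), F(5, 2))]
```

For the example list, the elements use 2, 3 and 4 edges respectively, and their values are 1/3, 1/2 and 5/2.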

Remark 3.4. The values $x_i$, $l_i$ and $u_i$ correspond to $x^*_{t_v}$, $l_v$ and $u_v$, respectively. The
properties enforced on the former are clearly satisfied by the latter with the exception of the
inequalities $x_i \le 1$. The reason for introducing these inequalities is the following. Without
them, the natural definition of the number of edges used by the $i$-th element of $C$ would be

$$\left\lceil \frac{l_i + u_i}{\min(x_i, 1)} \right\rceil.$$

However, in that case, for any configuration $C$ there would exist a configuration $C'$ with
$\mathrm{val}(C') \ge \mathrm{val}(C)$ and $x_i \le 1$ for all $i = 1, \ldots, n$. In order to construct $C'$ simply replace
all $x_i > 1$ with ones. If as a result we get $l_i < 2 - x_i$ and $u_i > 0$ for some $i$, simultaneously
decrease $u_i$ and increase $l_i$ until one of these inequalities becomes an equality.

For that reason, we prefer to simply assume $x_i \le 1$ and be able to use a (slightly) simpler
definition of $e_i$. As we will see, the inequalities $x_i \le 1$ turn out to be quite useful as well.

We denote by $\mathrm{CONF}(n, u^*)$ the set of all configurations $(x, l, u)$ of size $n$ such that
$\sum_{i=1}^{n} u_i = u^*$. We also use $\mathrm{OPT}(n, u^*)$ to denote any maximum value element of $\mathrm{CONF}(n, u^*)$,
and $\mathrm{VAL}(n, u^*)$ to denote its value. It is easy to see that

Fact 3.5. $|f'| \le \mathrm{VAL}(n, u^*)$.

Notice that determining $\mathrm{VAL}(n, u^*)$ for given $n$ and $u^*$ is a 2-dimensional knapsack
problem. Here, items are the possible triples $(x_i, l_i, u_i)$ satisfying the configuration definition. The
value of such a triple is equal to $\max(0, l_i + u_i - 1)$, i.e. its contribution to the configuration
value, if used in one. Also, the "mass" of $(x_i, l_i, u_i)$ is $u_i$ and its "volume" is $e_i$. We want to
maximize the total item value, while keeping the total mass $\le u^*$ and the total volume $\le n$.






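Lemma 3.6. For any $n \in \mathbb{N}$, $u^* \in \mathbb{R}_{\ge 0}$, there exists an optimal configuration in $\mathrm{CONF}(n, u^*)$
such that:

1. $e_i = \frac{l_i + u_i}{x_i}$ for all $i = 1, \ldots, n$,

2. $(l_i = 0) \lor (l_i = 2 - x_i)$ for all $i = 1, \ldots, n$.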

Proof. We prove each property by showing a way to transform any $C \in \mathrm{CONF}(n, u^*)$ into
$C' \in \mathrm{CONF}(n, u^*)$ such that $\mathrm{val}(C') \ge \mathrm{val}(C)$ and $C'$ satisfies the property.

Let us start with the first property, which basically says that all edges are fully saturated.
Assume we have $e_i > \frac{l_i + u_i}{x_i}$ for some $i \in \{1, \ldots, n\}$. If $l_i < 2 - x_i$, we increase $l_i$ until either
$e_i = \frac{l_i + u_i}{x_i}$, in which case we are done, or $l_i = 2 - x_i$. In the second case we start decreasing $x_i$
while increasing $l_i$ at the same rate, until $e_i = \frac{l_i + u_i}{x_i}$. Clearly, neither transformation decreases
the value of the configuration, and both keep $u_i$ and $e_i$ unchanged.

To prove the second property, let us assume that for some $i \in \{1, \ldots, n\}$ we have $0 < l_i <
2 - x_i$. We also assume that our configuration already satisfies the first property; in particular
we have $e_i = \frac{l_i}{x_i}$ ($u_i = 0$ since $l_i < 2 - x_i$). We increase $x_i$ and keep $l_i = e_i x_i$ until $l_i + x_i = 2$.
This increases the value of the configuration and keeps $u_i$ and $e_i$ unchanged. □

Theorem 3.7. For any $n \in \mathbb{N}$, $u^* \in \mathbb{R}_{\ge 0}$, and any $C \in \mathrm{CONF}(n, u^*)$ we have
$\mathrm{val}(C) \le u^* + \frac{1}{6}(n - u^*)$.

Proof. It is enough to prove the bound for optimal configurations satisfying the properties in
Lemma 3.6. Let $C$ be such a configuration. We will prove that for all $i = 1, \ldots, n$ we have:

$$v_i = \max(0, l_i + u_i - 1) \le u_i + \frac{1}{6}(e_i - u_i).$$

Summing this bound over all $i$ gives the desired claim.

If $u_i = l_i = e_i = 0$, then the bound clearly holds. It follows from Lemma 3.6 that the only
other case to consider is when $l_i = 2 - x_i$ and $e_i = \frac{l_i + u_i}{x_i}$. It follows from these two equalities
that $e_i x_i = l_i + u_i = 2 - x_i + u_i$ and so

$$x_i = \frac{2 + u_i}{1 + e_i}.$$

Using this expression to bound $v_i$ we get

$$v_i \le l_i + u_i - 1 = 2 - x_i + u_i - 1 = 1 + u_i - x_i = 1 + u_i - \frac{2 + u_i}{1 + e_i} = u_i - \frac{1 - (e_i - u_i)}{1 + e_i}.$$

We need to prove that

$$u_i - \frac{1 - (e_i - u_i)}{1 + e_i} \le u_i + \frac{1}{6}(e_i - u_i),$$

or equivalently

$$(e_i - u_i)\left(\frac{1}{6} - \frac{1}{1 + e_i}\right) + \frac{1}{1 + e_i} \ge 0.$$

Since $u_i \le e_i$ (this follows from property 1 in Lemma 3.6 and the fact that $x_i \le 1$), we have
two cases to consider.
Case 1: $\frac{1}{6} - \frac{1}{1 + e_i} \ge 0$. In this case the whole expression is clearly nonnegative.

Case 2: $\frac{1}{6} - \frac{1}{1 + e_i} < 0$, meaning that $e_i \in \{1, 2, 3, 4\}$. In this case we proceed as follows:

$$(e_i - u_i)\left(\frac{1}{6} - \frac{1}{1 + e_i}\right) + \frac{1}{1 + e_i} = u_i\left(\frac{1}{1 + e_i} - \frac{1}{6}\right) + \left(\frac{e_i}{6} - \frac{e_i - 1}{e_i + 1}\right).$$

The first term is clearly nonnegative and the second one can be checked to be nonnegative
for $e_i \in \{1, 2, 3, 4\}$. Note that integrality of $e_i$ plays a key role here, as the second
term is negative for $e_i \in (2, 3)$.

□
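The check mentioned in Case 2 is a finite computation and can be reproduced directly. The sketch below is only a numerical illustration of that step; the helper name `second_term` is hypothetical, and the term it evaluates is $\frac{e_i}{6} - \frac{e_i - 1}{e_i + 1}$.

```python
from fractions import Fraction as F

def second_term(e):
    """Value of e/6 - (e - 1)/(e + 1), computed exactly."""
    e = F(e)
    return e / 6 - (e - 1) / (e + 1)

# Integer values of e_i allowed in Case 2.
integer_values = {e: second_term(e) for e in (1, 2, 3, 4)}

# A non-integer value strictly between 2 and 3, where the term is negative.
fractional_value = second_term(F(5, 2))
```

The term equals 1/6, 0, 0 and 1/15 for $e_i = 1, 2, 3, 4$, so it vanishes exactly at $e_i \in \{2, 3\}$, while at $e_i = 5/2$ it equals $-1/84$.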

We can show that the above bound is essentially tight.

Theorem 3.8. For any $n \in \mathbb{N}$, $u^* \in \mathbb{R}_{\ge 0}$, there exists $C \in \mathrm{CONF}(n, u^*)$ such that
$\mathrm{val}(C) = u^* + \frac{1}{6}(n - u^*) - O(1)$.

Proof. It is quite easy to construct such $C$ by looking at the proof of Theorem 3.7. We get
the first tight example when, in Case 2 of the analysis, we have $u_i = 0$ and $e_i \in \{2, 3\}$. This
corresponds to configurations consisting of elements of the form:

• $x_i = \frac{2}{3}$, $l_i = \frac{4}{3}$, $u_i = 0$, in which case we have $e_i = 2$ and so $u_i + \frac{1}{6}(e_i - u_i) = \frac{1}{3}$ and
$v_i = l_i + u_i - 1 = \frac{1}{3}$, or

• $x_i = \frac{1}{2}$, $l_i = \frac{3}{2}$, $u_i = 0$, in which case we have $e_i = 3$ and so $u_i + \frac{1}{6}(e_i - u_i) = \frac{1}{2}$ and
$v_i = l_i + u_i - 1 = \frac{1}{2}$.

Using these two items we can construct tight examples for $u^* = 0$ and arbitrary $n \ge 2$.

To handle the case of $u^* > 0$ we need another (almost) tight case in the proof of
Theorem 3.7, which occurs when $u_i$ is close to $e_i$ and $e_i$ is relatively large. In this case the value
of the expression $(e_i - u_i)\left(\frac{1}{6} - \frac{1}{1 + e_i}\right) + \frac{1}{1 + e_i}$ is clearly close to 0. This corresponds to using
items of the form $x_i = l_i = 1$ and arbitrary $u_i$. For such elements we have $e_i = \lceil u_i + 1 \rceil$
and so

$$u_i + \frac{1}{6}(e_i - u_i) \le u_i + \frac{1}{3}$$

and

$$v_i = l_i + u_i - 1 = u_i,$$

so the difference between the two is at most $\frac{1}{3}$. By combining the three types of items
described, we can clearly construct $C$ as required for any $n$ and $u^*$. □

We are now ready to prove Lemma 3.1.

Proof (of Lemma 3.1). It follows from Theorem 3.7 and Fact 3.5 that

$$|f'| \le u^* + \frac{1}{6}(n - u^*) = \frac{5}{6}u^* + \frac{1}{6}n.$$

Using Fact 3.3 we get:

$$|f'| \le \frac{5}{6} \cdot 2(\mathrm{OPT}_{LP} - n) + \frac{1}{6}n = \frac{5}{3}\,\mathrm{OPT}_{LP} - \frac{3}{2}n.$$

□
4 New upper bound for $|f''|$

In this section we give a new bound for $|f''|$. We do not bound $|f''|$ directly, as in Lemma 2.8. Instead,
we show the following.

Lemma 4.1.

$$|f''| \le \frac{5}{6}\bigl(2\,\mathrm{OPT}_{LP}(G) - 2n - u^*\bigr).$$

What this says is basically that $f''$ can be fully paid for by the slack we get in Fact 3.3.
To better understand this bound, and in particular the constant $\frac{5}{6}$, before we proceed to prove
it, let us first show how it can be used.

Corollary 4.2. $|f| \le \frac{5}{3}\,\mathrm{OPT}_{LP} - \frac{3}{2}n$.

Proof. We have $|f| \le |f'| + |f''| \le \frac{5}{6}u^* + \frac{1}{6}n + \frac{5}{6}(2\,\mathrm{OPT}_{LP} - 2n - u^*) = \frac{5}{3}\,\mathrm{OPT}_{LP} - \frac{3}{2}n$. □
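The arithmetic in this chain can be expanded term by term. The block below is only a spelled-out version of the step above; it uses the intermediate bound $|f'| \le \frac{5}{6}u^* + \frac{1}{6}n$ from the proof of Lemma 3.1 and reads the constant in Lemma 4.1 as $\frac{5}{6}$, which is the value for which the $u^*$ terms cancel.

```latex
\begin{align*}
|f| &\le \tfrac{5}{6}u^* + \tfrac{1}{6}n
        + \tfrac{5}{6}\bigl(2\,\mathrm{OPT}_{LP} - 2n - u^*\bigr) \\
    &= \tfrac{5}{3}\,\mathrm{OPT}_{LP}
        + \bigl(\tfrac{1}{6} - \tfrac{5}{3}\bigr)n
        + \bigl(\tfrac{5}{6} - \tfrac{5}{6}\bigr)u^* \\
    &= \tfrac{5}{3}\,\mathrm{OPT}_{LP} - \tfrac{3}{2}n .
\end{align*}
```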

There are several interesting things to note here. First of all, we got the exact same
bound as in Lemma 3.1, which means that $|f''|$ can be fully paid for by the slack in
Fact 3.3, as suggested earlier. In particular, this means that improving the constant $\frac{5}{6}$ in
Lemma 4.1 is pointless, since we would still be getting the same bound on $|f|$ when $|f''| = 0$.
Therefore, we do not try to optimize this constant, but instead make the proof of the Lemma
as straightforward as possible.

Let us now proceed to prove Lemma 4.1. For any non-root in-vertex $w$ let $z_w = x^*_{t_w} +
x^*(B(w))$. Basically, if $v$ is the father of $w$ in $T$, then $z_w$ is the total value of $x^*$ over all edges
connecting $v$ with vertices in the subtree $T_w$ of $T$ determined by $w$. Also, let $\varepsilon_w$ be the total
value of $x^*$ over all edges connecting vertices in $T_w$ with vertices above $v$.

We can formulate the following local version of Lemma 4.1.

Lemma 4.3. For every non-root vertex $v$ of $G$ we have

$$\sum_{w \in \mathcal{I}_v} \max(0, 1 - \varepsilon_w) \le \frac{5}{6}\Bigl(x^*(v) - 2 - \sum_{w \in \mathcal{I}_v} u_w\Bigr).$$

Notice that Lemma 4.1 easily follows from Lemma 4.3 by summing over all non-root
vertices.

Proof (of Lemma 4.3). Let $v$ be a non-root vertex of $G$. We define 3 types of vertices in $\mathcal{I}_v$:

• $w \in \mathcal{I}_v$ is heavy if $\varepsilon_w < 1$ and $z_w > 2$,

• $w \in \mathcal{I}_v$ is light if $\varepsilon_w < 1$ and $z_w \le 2$,

• $w \in \mathcal{I}_v$ is trivial otherwise (i.e. $\varepsilon_w \ge 1$).

We denote by $H_v$ and $L_v$ the sets of heavy and light vertices in $\mathcal{I}_v$, respectively. Intuitively,
heavy vertices are the ones that contribute to both $u^*$ and $|f''|$, light vertices contribute only
to $|f''|$, and the remaining vertices are trivial.
We are going to use the following observations:

1. $z_w \ge 2 - \varepsilon_w$ for all $w \in H_v \cup L_v$,

2. $x^*(v) \ge \sum_{w \in H_v \cup L_v} z_w + \max\bigl(0, 2 - \sum_{w \in H_v \cup L_v} \varepsilon_w\bigr)$.

The first observation follows from the Held-Karp inequality for the cut induced by the subtree
$T_w$ of $T$ determined by $w$. The second follows from the Held-Karp inequality as well, this time
for the cut induced by the set $\bigcup_{w \in H_v \cup L_v} T_w \cup \{v\}$. The only edges crossing this cut are the
back-edges with total $x^*$ value $\sum_{w \in H_v \cup L_v} \varepsilon_w$ and edges incident to $v$, but not to a vertex from
a subtree induced by one of $w \in H_v \cup L_v$. The second term in the second observation is a
lower bound on the total $x^*$ value of this second kind of edges.

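Note that the trivial vertices might have $z_w > 2$ and so contribute to $u^*$. However, in that
case the proof is quite simple and it will be advantageous for us to get it out of our way. Let
$w_0$ be a trivial vertex with $z_{w_0} > 2$. What we do is basically use this vertex to cancel out the
lone 2 in the RHS of the bound:

$$x^*(v) - 2 - \sum_{w \in \mathcal{I}_v} u_w \ge \sum_{w \in \mathcal{I}_v \setminus \{w_0\}} z_w + (z_{w_0} - u_{w_0} - 2) - \sum_{w \in \mathcal{I}_v \setminus \{w_0\}} u_w = \sum_{w \in \mathcal{I}_v \setminus \{w_0\}} (z_w - u_w).$$

Since $w_0 \notin H_v \cup L_v$ we thus have

$$\frac{5}{6}\Bigl(x^*(v) - 2 - \sum_{w \in \mathcal{I}_v} u_w\Bigr) \ge \frac{5}{6}\sum_{w \in H_v \cup L_v} (z_w - u_w) \ge \sum_{w \in H_v \cup L_v} (1 - \varepsilon_w).$$

The last inequality holds because we have $z_w - u_w = 2$ for heavy $w$ and $z_w - u_w = z_w \ge 2 - \varepsilon_w$
for light $w$. We can thus assume that all trivial vertices have $z_w \le 2$ (and so $u_w = 0$).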
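Note that using our observations, we can reformulate our claim as follows:

$$\sum_{w \in H_v \cup L_v} (1 - \varepsilon_w) \le \frac{5}{6}\Bigl(\sum_{w \in H_v \cup L_v} z_w + \max\Bigl(0, 2 - \sum_{w \in H_v \cup L_v} \varepsilon_w\Bigr) - 2 - \sum_{w \in \mathcal{I}_v} u_w\Bigr),$$

and since we now assume that trivial vertices have $z_w \le 2$, it is enough to prove:

$$\sum_{w \in H_v \cup L_v} (1 - \varepsilon_w) \le \frac{5}{6}\Bigl(\sum_{w \in L_v} z_w + \max\Bigl(0, 2 - \sum_{w \in H_v \cup L_v} \varepsilon_w\Bigr) + 2(|H_v| - 1)\Bigr)$$

(since $z_w = 2 + u_w$ for $w \in H_v$).

Clearly, if all $w \in \mathcal{I}_v$ are trivial, both sides of the bound are 0 and so it trivially holds.
Otherwise, we consider the following two cases: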

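Case 1: $\sum_{w \in H_v \cup L_v} \varepsilon_w \ge 2$. Notice that this implies $|H_v| + |L_v| \ge 3$. In this case the RHS
of the bound becomes

$$\frac{5}{6}\Bigl(\sum_{w \in L_v} z_w + 2(|H_v| - 1)\Bigr) \ge \frac{5}{6}\Bigl(\sum_{w \in L_v} (2 - \varepsilon_w) + 2(|H_v| - 1)\Bigr).$$

The ratio of the above expression and the LHS is lower bounded by the ratio of these
same expressions with all $\varepsilon_w = 0$, i.e. $\frac{5}{6} \cdot \frac{2|L_v| + 2|H_v| - 2}{|L_v| + |H_v|}$, which is definitely at least 1,
since $|L_v| + |H_v| \ge 3$.

Case 2: $\sum_{w \in H_v \cup L_v} \varepsilon_w < 2$. In this case the RHS of the bound becomes

$$\frac{5}{6}\Bigl(\sum_{w \in L_v} z_w + 2 - \sum_{w \in H_v \cup L_v} \varepsilon_w + 2(|H_v| - 1)\Bigr) \ge \frac{5}{6}\Bigl(\sum_{w \in L_v} (2 - 2\varepsilon_w) + \sum_{w \in H_v} (2 - \varepsilon_w)\Bigr).$$

The claim now follows by observing that $(2 - 2\varepsilon_w) = 2(1 - \varepsilon_w)$ and $2 - \varepsilon_w \ge 2(1 - \varepsilon_w)$,
so each summand on the right is at least $\frac{5}{6} \cdot 2(1 - \varepsilon_w) \ge 1 - \varepsilon_w$.

□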

5 Applications to graphic TSP and TSPP 

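As a consequence of Corollary 4.2 we get improved approximation factors for graphic TSP
and graphic TSPP.

Theorem 5.1. There is a $\frac{13}{9}$-approximation algorithm for graphic TSP.

Proof. Using the bound of Corollary 4.2 we get

$$|f| \le \frac{5}{3}\,\mathrm{OPT}_{LP} - \frac{3}{2}n.$$

The TSP tour guaranteed by Theorem 2.4 has size at most

$$\frac{4}{3}n + \frac{2}{3}|f| - \frac{2}{3} \le \frac{4}{3}n + \frac{2}{3}\Bigl(\frac{5}{3}\,\mathrm{OPT}_{LP} - \frac{3}{2}n\Bigr) - \frac{2}{3} = \frac{10}{9}\,\mathrm{OPT}_{LP} + \frac{n - 2}{3}.$$

Notice that the approximation ratio of the resulting algorithm is getting better with $\mathrm{OPT}_{LP}$
increasing (with fixed $n$). Therefore the worst case bound is the one we get for $\mathrm{OPT}_{LP} = n$,
i.e. $\frac{10}{9} + \frac{1}{3} = \frac{13}{9}$. □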
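The worst-case computation can be checked with exact arithmetic. The sketch below is a hedged illustration only: it plugs the bound $|f| \le \frac{5}{3}\mathrm{OPT}_{LP} - \frac{3}{2}n$ of Corollary 4.2 into the edge count $\frac{4}{3}n + \frac{2}{3}|f| - \frac{2}{3}$ of Theorem 2.4, and the function names are hypothetical.

```python
from fractions import Fraction as F

def tour_size_bound(n, opt_lp):
    """Upper bound on the tour size: 4n/3 + (2/3)|f| - 2/3 with |f| bounded as in Corollary 4.2."""
    f_bound = F(5, 3) * opt_lp - F(3, 2) * n
    return F(4, 3) * n + F(2, 3) * f_bound - F(2, 3)

def ratio_bound(n, opt_lp):
    """Resulting bound on the approximation ratio relative to OPT_LP."""
    return tour_size_bound(n, opt_lp) / opt_lp
```

For $\mathrm{OPT}_{LP} = n$ the ratio equals $\frac{13}{9} - \frac{2}{3n}$, which stays below $\frac{13}{9}$, and for fixed $n$ it only decreases as $\mathrm{OPT}_{LP}$ grows.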

Remark 5.2. This analysis is significantly simpler than the one in [8]. Balancing with
Christofides's algorithm is no longer necessary since bounds on approximation ratios for both
algorithms are decreasing in $\mathrm{OPT}_{LP}$.

Theorem 5.3. There is a $(\frac{19}{12} + \varepsilon)$-approximation algorithm for graphic TSPP, for any $\varepsilon > 0$.

Proof. This proof is very similar to the proof of Theorem 1.2 in [8]. However, the reasoning
is slightly simpler, in our opinion. Suppose we want to approximate the graphic TSPP in
$G = (V, E)$ with endvertices $s$ and $t$. Let $G' = (V, E \cup \{e'\})$, where $e' = \{s, t\}$, and let $\mathrm{OPT}_{LP}$
denote $\mathrm{OPT}_{LP}(G')$. Also, let $d$ be the distance between $s$ and $t$ in $G$. Using the bound of
Corollary 4.2 we get

$$|f| \le \frac{5}{3}\,\mathrm{OPT}_{LP} - \frac{3}{2}n.$$

The TSP path guaranteed by Theorem 2.5 has size at most

$$\frac{4}{3}n + \frac{2}{3}|f| - \frac{2}{3} + \frac{d}{3} \le \frac{4}{3}n + \frac{2}{3}\Bigl(\frac{5}{3}\,\mathrm{OPT}_{LP} - \frac{3}{2}n\Bigr) - \frac{2}{3} + \frac{d}{3} = \frac{10}{9}\,\mathrm{OPT}_{LP} + \frac{n + d - 2}{3}.$$

It is clear that the quality of this algorithm deteriorates as $d$ increases. We are going to balance
it with another algorithm that displays the opposite behaviour. The following approach is
folklore: Find a spanning tree $T$ in $G$ and double all edges of $T$ except those that lie on the
unique shortest path connecting $s$ and $t$. The resulting graph has a spanning Eulerian path
connecting $s$ and $t$ with at most $2(n - 1) - d$ edges.

Since $\mathrm{OPT}_{LP} - 1 \le \mathrm{OPT}_{LP}(G, s, t)$ is a lower bound for the optimal solution, the two
approximation algorithms have approximation ratios bounded by
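A minimal sketch of this folklore construction is given below. It makes one choice that the text leaves open: the spanning tree is taken to be a BFS tree rooted at $s$, which guarantees that the tree path from $s$ to $t$ is a shortest path of length $d$. The function name `doubled_tree_path` and the adjacency-dictionary input are hypothetical, and the graph is assumed connected.

```python
from collections import deque

def doubled_tree_path(adj, s, t):
    """Return (multigraph edge list, d) for the doubled-tree construction."""
    parent = {s: None}
    queue = deque([s])
    while queue:                          # breadth-first search from s
        v = queue.popleft()
        for w in adj[v]:
            if w not in parent:
                parent[w] = v
                queue.append(w)

    path_edges = set()                    # tree edges on the s-t path
    v = t
    while parent[v] is not None:
        path_edges.add(frozenset((v, parent[v])))
        v = parent[v]

    multigraph = []
    for w, v in parent.items():
        if v is None:
            continue
        multigraph.append((v, w))
        if frozenset((v, w)) not in path_edges:
            multigraph.append((v, w))     # double every edge off the path
    return multigraph, len(path_edges)
```

On the 4-cycle 0-1-2-3 with the chord {1, 3}, taking $s = 0$ and $t = 2$ gives $d = 2$ and a multigraph with $2(n - 1) - d = 4$ edges in which exactly $s$ and $t$ have odd degree, so it has an Eulerian path between them.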
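$$\frac{\frac{10}{9}\,\mathrm{OPT}_{LP} + \frac{n + d - 2}{3}}{\mathrm{OPT}_{LP} - 1}$$

and

$$\frac{2n - 2 - d}{\mathrm{OPT}_{LP} - 1}.$$

For a fixed value of $\mathrm{OPT}_{LP}$ the first of these expressions is increasing and the second is
decreasing in $d$. Therefore the worst case bound we get for an algorithm that picks the best
of the two solutions occurs when

$$\frac{10}{9}\,\mathrm{OPT}_{LP} + \frac{n + d - 2}{3} = 2n - 2 - d,$$

which leads to

$$d = \frac{5}{4}n - \frac{5}{6}\,\mathrm{OPT}_{LP} - 1.$$

For this value of $d$ the approximation ratio is at most

$$\frac{2n - 2 - \bigl(\frac{5}{4}n - \frac{5}{6}\,\mathrm{OPT}_{LP} - 1\bigr)}{\mathrm{OPT}_{LP} - 1} = \frac{\frac{3}{4}n - 1 + \frac{5}{6}\,\mathrm{OPT}_{LP}}{\mathrm{OPT}_{LP} - 1} = \frac{\frac{3}{4}n - \frac{1}{6}}{\mathrm{OPT}_{LP} - 1} + \frac{5}{6}.$$

Since $\mathrm{OPT}_{LP} \ge n$ this is at most

$$\frac{\frac{3}{4}n - \frac{1}{6}}{n - 1} + \frac{5}{6} = \frac{3}{4} + \frac{5}{6} + O\Bigl(\frac{1}{n}\Bigr) = \frac{19}{12} + O\Bigl(\frac{1}{n}\Bigr),$$

which proves the claim. □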
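The balancing step can be verified with exact rational arithmetic. The sketch below is an illustration under the bounds derived in this proof, namely $\frac{10}{9}\mathrm{OPT}_{LP} + \frac{n + d - 2}{3}$ for the circulation-based path and $2n - 2 - d$ for the doubled tree; the function names are hypothetical and $d$ is allowed to be fractional, since only the crossing point of the two bounds matters here.

```python
from fractions import Fraction as F

def circulation_path_bound(n, opt_lp, d):
    """Size bound for the path obtained via Theorem 2.5."""
    return F(10, 9) * opt_lp + (F(n) + d - 2) / 3

def doubled_tree_bound(n, d):
    """Size bound for the folklore doubled-tree path."""
    return 2 * F(n) - 2 - d

def balancing_d(n, opt_lp):
    """Value of d at which the two bounds coincide."""
    return F(5, 4) * n - F(5, 6) * opt_lp - 1

def worst_ratio(n):
    """Ratio bound at the balancing point in the worst case OPT_LP = n."""
    d = balancing_d(n, n)
    return doubled_tree_bound(n, d) / (n - 1)
```

For $n = 13$ and $\mathrm{OPT}_{LP} = 13$ the balancing value is $d = \frac{53}{12}$, both bounds equal $\frac{235}{12}$, and the resulting ratio is $\frac{19}{12} + \frac{7}{12(n - 1)}$, which approaches $\frac{19}{12}$ as $n$ grows.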